

<!DOCTYPE html>
<!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]-->
<head>
  <meta charset="utf-8">
  
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  
  <title>2.14 Combining and Concatenating Strings &mdash; python3-cookbook 3.0.0 documentation</title>
  

  
  
  
  

  

  
  
    

  

  <link rel="stylesheet" href="../_static/css/theme.css" type="text/css" />
  <link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
    <link rel="index" title="Index" href="../genindex.html" />
    <link rel="search" title="Search" href="../search.html" />
    <link rel="next" title="2.15 Interpolating Variables in Strings" href="p15_interpolating_variables_in_strings.html" />
    <link rel="prev" title="2.13 Aligning Text Strings" href="p13_aligning_text_strings.html" /> 

  
  <script src="../_static/js/modernizr.min.js"></script>

</head>

<body class="wy-body-for-nav">

   
  <div class="wy-grid-for-nav">

    

    <section data-toggle="wy-nav-shift" class="wy-nav-content-wrap">

      


      <div class="wy-nav-content">
        
        <div class="rst-content">
        
          















          <div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
           <div itemprop="articleBody">
            
  <div class="section" id="id1">
<h1>2.14 Combining and Concatenating Strings<a class="headerlink" href="#id1" title="Permalink to this headline">¶</a></h1>
<div class="section" id="id2">
<h2>Problem<a class="headerlink" href="#id2" title="Permalink to this headline">¶</a></h2>
<p>You want to combine several small strings into one larger string.</p>
</div>
<div class="section" id="id3">
<h2>Solution<a class="headerlink" href="#id3" title="Permalink to this headline">¶</a></h2>
<p>If the strings you want to combine are in a sequence or other <code class="docutils literal notranslate"><span class="pre">iterable</span></code>, the fastest way to combine them is the <code class="docutils literal notranslate"><span class="pre">join()</span></code> method. For example:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">parts</span> <span class="o">=</span> <span class="p">[</span><span class="s1">&#39;Is&#39;</span><span class="p">,</span> <span class="s1">&#39;Chicago&#39;</span><span class="p">,</span> <span class="s1">&#39;Not&#39;</span><span class="p">,</span> <span class="s1">&#39;Chicago?&#39;</span><span class="p">]</span>
<span class="gp">&gt;&gt;&gt; </span><span class="s1">&#39; &#39;</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">parts</span><span class="p">)</span>
<span class="go">&#39;Is Chicago Not Chicago?&#39;</span>
<span class="gp">&gt;&gt;&gt; </span><span class="s1">&#39;,&#39;</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">parts</span><span class="p">)</span>
<span class="go">&#39;Is,Chicago,Not,Chicago?&#39;</span>
<span class="gp">&gt;&gt;&gt; </span><span class="s1">&#39;&#39;</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">parts</span><span class="p">)</span>
<span class="go">&#39;IsChicagoNotChicago?&#39;</span>
<span class="go">&gt;&gt;&gt;</span>
</pre></div>
</div>
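<p>At first glance this syntax may look odd, but <code class="docutils literal notranslate"><span class="pre">join()</span></code> is defined as a method on strings.
Part of the reason is that the objects you want to join may come from many different kinds of data sequences (lists, tuples, dictionaries, files, sets, generators, and so on),
and it would be redundant to implement a <code class="docutils literal notranslate"><span class="pre">join()</span></code> method on every one of them separately.
Instead, you specify the separator string you want and call its <code class="docutils literal notranslate"><span class="pre">join()</span></code> method to glue the text fragments together.</p>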
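<p>As a small illustration of the point above, the sketch below joins strings taken from a tuple, a dictionary, and a generator expression. The variable names and the sample data are made up for this example.</p>

```python
# join() accepts any iterable of strings, not just lists.
words_tuple = ('Is', 'Chicago', 'Not', 'Chicago?')
from_tuple = ' '.join(words_tuple)

# Joining a dict iterates over its keys (insertion order in Python 3.7+).
prices = {'ACME': 91.1, 'IBM': 120.5}
from_dict = ','.join(prices)

# A generator expression works too.
from_generator = '-'.join(w.upper() for w in words_tuple)

print(from_tuple)      # Is Chicago Not Chicago?
print(from_dict)       # ACME,IBM
print(from_generator)  # IS-CHICAGO-NOT-CHICAGO?
```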
<p>If you are only combining a few strings, the plus operator (+) usually works well enough:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">a</span> <span class="o">=</span> <span class="s1">&#39;Is Chicago&#39;</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">b</span> <span class="o">=</span> <span class="s1">&#39;Not Chicago?&#39;</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">a</span> <span class="o">+</span> <span class="s1">&#39; &#39;</span> <span class="o">+</span> <span class="n">b</span>
<span class="go">&#39;Is Chicago Not Chicago?&#39;</span>
<span class="go">&gt;&gt;&gt;</span>
</pre></div>
</div>
<p>The + operator also works fine as a substitute for more complicated string formatting operations. For example:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="k">print</span><span class="p">(</span><span class="s1">&#39;{} {}&#39;</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">a</span><span class="p">,</span><span class="n">b</span><span class="p">))</span>
<span class="go">Is Chicago Not Chicago?</span>
<span class="gp">&gt;&gt;&gt; </span><span class="k">print</span><span class="p">(</span><span class="n">a</span> <span class="o">+</span> <span class="s1">&#39; &#39;</span> <span class="o">+</span> <span class="n">b</span><span class="p">)</span>
<span class="go">Is Chicago Not Chicago?</span>
<span class="go">&gt;&gt;&gt;</span>
</pre></div>
</div>
<p>If you want to combine two string literals in source code, simply place them next to each other with no + operator. For example:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">a</span> <span class="o">=</span> <span class="s1">&#39;Hello&#39;</span> <span class="s1">&#39;World&#39;</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">a</span>
<span class="go">&#39;HelloWorld&#39;</span>
<span class="go">&gt;&gt;&gt;</span>
</pre></div>
</div>
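<p>Adjacent literals are also merged when they sit on separate lines inside parentheses, which can be handy for breaking up long strings. The sketch below uses made-up names and also shows that the rule applies to literals only, not to variables.</p>

```python
# Adjacent literals are merged at compile time, even across lines
# when they are wrapped in parentheses.
message = ('Hello'
           'World')

# This only applies to literals; a variable still needs + or join().
greeting = 'Hello'
combined = greeting + 'World'

print(message)   # HelloWorld
print(combined)  # HelloWorld
```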
</div>
<div class="section" id="id4">
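<h2>Discussion<a class="headerlink" href="#id4" title="Permalink to this headline">¶</a></h2>
<p>String concatenation may not seem to warrant a whole recipe.
It should not be underestimated, though: programmers often make choices when formatting strings that severely hurt the performance of their applications.</p>
<p>The most important point is that using the + operator to join a large number of strings is very inefficient,
because each concatenation triggers memory copies and garbage collection.
In particular, you should never write string concatenation code like this:</p>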
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">s</span> <span class="o">=</span> <span class="s1">&#39;&#39;</span>
<span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="n">parts</span><span class="p">:</span>
    <span class="n">s</span> <span class="o">+=</span> <span class="n">p</span>
</pre></div>
</div>
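<p>This runs noticeably slower than the <code class="docutils literal notranslate"><span class="pre">join()</span></code> method, mainly because every += operation creates a new string object.
You are better off collecting all of the fragments first and then joining them in one step.</p>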
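<p>One way to check this on your own machine is a rough timing comparison with the standard <code class="docutils literal notranslate"><span class="pre">timeit</span></code> module. The function names and the test data below are assumptions made for this sketch. The measured gap depends on the interpreter and version (CPython, for instance, can sometimes extend a string in place), so no particular numbers are claimed here.</p>

```python
import timeit

parts = ['chunk'] * 10000  # hypothetical test data


def concat_with_plus(parts):
    """Build the result one += at a time."""
    s = ''
    for p in parts:
        s += p
    return s


def concat_with_join(parts):
    """Build the result with a single join() call."""
    return ''.join(parts)


# Both approaches must produce the same string.
assert concat_with_plus(parts) == concat_with_join(parts)

plus_time = timeit.timeit(lambda: concat_with_plus(parts), number=100)
join_time = timeit.timeit(lambda: concat_with_join(parts), number=100)
print('+= loop: {:.4f}s'.format(plus_time))
print('join():  {:.4f}s'.format(join_time))
```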
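<p>A related and rather neat trick is to use a generator expression (see Recipe 1.19) to convert data to strings and concatenate them at the same time. For example:</p>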
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">data</span> <span class="o">=</span> <span class="p">[</span><span class="s1">&#39;ACME&#39;</span><span class="p">,</span> <span class="mi">50</span><span class="p">,</span> <span class="mf">91.1</span><span class="p">]</span>
<span class="gp">&gt;&gt;&gt; </span><span class="s1">&#39;,&#39;</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="n">d</span><span class="p">)</span> <span class="k">for</span> <span class="n">d</span> <span class="ow">in</span> <span class="n">data</span><span class="p">)</span>
<span class="go">&#39;ACME,50,91.1&#39;</span>
<span class="go">&gt;&gt;&gt;</span>
</pre></div>
</div>
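<p>The conversion with <code class="docutils literal notranslate"><span class="pre">str()</span></code> matters because <code class="docutils literal notranslate"><span class="pre">join()</span></code> only accepts strings. The sketch below shows the error raised for mixed data and an equivalent spelling with <code class="docutils literal notranslate"><span class="pre">map()</span></code>. The names <code class="docutils literal notranslate"><span class="pre">error</span></code> and <code class="docutils literal notranslate"><span class="pre">line</span></code> are chosen for illustration.</p>

```python
data = ['ACME', 50, 91.1]

# join() requires every item to be a string; mixed types raise TypeError.
error = None
try:
    ','.join(data)
except TypeError as exc:
    error = exc
    print('join failed:', exc)

# Converting each item with str() first avoids the error.
line = ','.join(map(str, data))
print(line)  # ACME,50,91.1
```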
<p>Also watch out for unnecessary string concatenation. Sometimes programmers concatenate strings when it is not needed at all, for example when printing:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="k">print</span><span class="p">(</span><span class="n">a</span> <span class="o">+</span> <span class="s1">&#39;:&#39;</span> <span class="o">+</span> <span class="n">b</span> <span class="o">+</span> <span class="s1">&#39;:&#39;</span> <span class="o">+</span> <span class="n">c</span><span class="p">)</span> <span class="c1"># Ugly</span>
<span class="k">print</span><span class="p">(</span><span class="s1">&#39;:&#39;</span><span class="o">.</span><span class="n">join</span><span class="p">([</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">]))</span> <span class="c1"># Still ugly</span>
<span class="k">print</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">sep</span><span class="o">=</span><span class="s1">&#39;:&#39;</span><span class="p">)</span> <span class="c1"># Better</span>
</pre></div>
</div>
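<p>To confirm that the <code class="docutils literal notranslate"><span class="pre">sep</span></code> form writes the same text as the <code class="docutils literal notranslate"><span class="pre">join()</span></code> form, the sketch below captures both outputs in <code class="docutils literal notranslate"><span class="pre">io.StringIO</span></code> buffers. The sample values are hypothetical. It also shows that <code class="docutils literal notranslate"><span class="pre">print()</span></code> converts non-string arguments by itself.</p>

```python
import io

a, b, c = 'x', 'y', 'z'  # hypothetical sample values

# Capture what each form would print.
buf_join = io.StringIO()
buf_sep = io.StringIO()
print(':'.join([a, b, c]), file=buf_join)
print(a, b, c, sep=':', file=buf_sep)

same_output = buf_join.getvalue() == buf_sep.getvalue()
print(same_output)  # True

# Unlike join(), print() converts non-string arguments itself.
buf_mixed = io.StringIO()
print('ACME', 50, 91.1, sep=',', file=buf_mixed)
mixed = buf_mixed.getvalue()
```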
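<p>When mixing I/O operations and string concatenation, you may need to study your program carefully.
For example, consider the following two code fragments:</p>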
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="c1"># Version 1 (string concatenation)</span>
<span class="n">f</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="n">chunk1</span> <span class="o">+</span> <span class="n">chunk2</span><span class="p">)</span>

<span class="c1"># Version 2 (separate I/O operations)</span>
<span class="n">f</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="n">chunk1</span><span class="p">)</span>
<span class="n">f</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="n">chunk2</span><span class="p">)</span>
</pre></div>
</div>
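<p>If the two strings are small, the first version is likely to perform better because I/O system calls are inherently expensive.
On the other hand, if the two strings are large, the second version may be more efficient
because it avoids building a large temporary result and copying large blocks of memory.
Again, which approach works best depends on your application's data and has to be determined case by case.</p>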
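<p>A rough way to compare the two versions for your own data is sketched below. The helper names, chunk sizes, and file name are assumptions for this example. Python file objects are buffered by default, so a call to <code class="docutils literal notranslate"><span class="pre">write()</span></code> does not necessarily map to one system call, and the results will vary with buffer size and chunk size.</p>

```python
import os
import tempfile
import timeit

chunk1 = 'a' * 10  # hypothetical small chunks; try larger sizes too
chunk2 = 'b' * 10


def write_concatenated(f):
    # Version 1: one write of the concatenated string
    f.write(chunk1 + chunk2)


def write_separately(f):
    # Version 2: two separate writes
    f.write(chunk1)
    f.write(chunk2)


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'out.txt')
    with open(path, 'w') as f:
        t_concat = timeit.timeit(lambda: write_concatenated(f), number=10000)
    with open(path, 'w') as f:
        t_separate = timeit.timeit(lambda: write_separately(f), number=10000)

print('concatenated: {:.4f}s'.format(t_concat))
print('separate:     {:.4f}s'.format(t_separate))
```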
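<p>Finally, if you are writing code that builds output from a large number of small strings,
consider writing it as a generator function that uses yield to emit the fragments. For example:</p>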
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">sample</span><span class="p">():</span>
    <span class="k">yield</span> <span class="s1">&#39;Is&#39;</span>
    <span class="k">yield</span> <span class="s1">&#39;Chicago&#39;</span>
    <span class="k">yield</span> <span class="s1">&#39;Not&#39;</span>
    <span class="k">yield</span> <span class="s1">&#39;Chicago?&#39;</span>
</pre></div>
</div>
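<p>An interesting aspect of this approach is that it makes no assumption about how the fragments will be assembled.
For example, you can simply join them with the <code class="docutils literal notranslate"><span class="pre">join()</span></code> method:</p>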
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">text</span> <span class="o">=</span> <span class="s1">&#39;&#39;</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">sample</span><span class="p">())</span>
</pre></div>
</div>
<p>Or you can redirect the fragments to I/O:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="k">for</span> <span class="n">part</span> <span class="ow">in</span> <span class="n">sample</span><span class="p">():</span>
    <span class="n">f</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="n">part</span><span class="p">)</span>
</pre></div>
</div>
<p>Or you can come up with a hybrid scheme that combines the fragments intelligently with I/O operations:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">combine</span><span class="p">(</span><span class="n">source</span><span class="p">,</span> <span class="n">maxsize</span><span class="p">):</span>
    <span class="n">parts</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="n">size</span> <span class="o">=</span> <span class="mi">0</span>
    <span class="k">for</span> <span class="n">part</span> <span class="ow">in</span> <span class="n">source</span><span class="p">:</span>
        <span class="n">parts</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">part</span><span class="p">)</span>
        <span class="n">size</span> <span class="o">+=</span> <span class="nb">len</span><span class="p">(</span><span class="n">part</span><span class="p">)</span>
        <span class="k">if</span> <span class="n">size</span> <span class="o">&gt;</span> <span class="n">maxsize</span><span class="p">:</span>
            <span class="k">yield</span> <span class="s1">&#39;&#39;</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">parts</span><span class="p">)</span>
            <span class="n">parts</span> <span class="o">=</span> <span class="p">[]</span>
            <span class="n">size</span> <span class="o">=</span> <span class="mi">0</span>
    <span class="k">yield</span> <span class="s1">&#39;&#39;</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">parts</span><span class="p">)</span>

<span class="c1"># Combine with file operations</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s1">&#39;filename&#39;</span><span class="p">,</span> <span class="s1">&#39;w&#39;</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
    <span class="k">for</span> <span class="n">part</span> <span class="ow">in</span> <span class="n">combine</span><span class="p">(</span><span class="n">sample</span><span class="p">(),</span> <span class="mi">32768</span><span class="p">):</span>
        <span class="n">f</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="n">part</span><span class="p">)</span>
</pre></div>
</div>
<p>The key point is that the original generator function does not need to know these details. It only has to yield the fragments.</p>
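<p>To see how the buffering in <code class="docutils literal notranslate"><span class="pre">combine()</span></code> groups fragments, the sketch below repeats the two functions from this recipe and runs them with a deliberately tiny <code class="docutils literal notranslate"><span class="pre">maxsize</span></code> of 8, a value picked only for this illustration. Note that the final yield can produce an empty string when the buffer has just been flushed, which is harmless when the chunks are passed to <code class="docutils literal notranslate"><span class="pre">write()</span></code>.</p>

```python
def sample():
    yield 'Is'
    yield 'Chicago'
    yield 'Not'
    yield 'Chicago?'


def combine(source, maxsize):
    parts = []
    size = 0
    for part in source:
        parts.append(part)
        size += len(part)
        # Flush the buffer once it grows past maxsize characters.
        if size > maxsize:
            yield ''.join(parts)
            parts = []
            size = 0
    # Emit whatever is left (possibly an empty string).
    yield ''.join(parts)


chunks = list(combine(sample(), 8))
print(chunks)  # ['IsChicago', 'NotChicago?', '']

# No text is lost or reordered by the chunking.
assert ''.join(chunks) == ''.join(sample())
```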
</div>
</div>


           </div>
           
          </div>
          <footer>
  
  

  <hr/>

  <div role="contentinfo">
    <p>
        &copy; Copyright 2017, 熊能.

    </p>
  </div>
  Built with <a href="http://sphinx-doc.org/">Sphinx</a> using a <a href="https://github.com/rtfd/sphinx_rtd_theme">theme</a> provided by <a href="https://readthedocs.org">Read the Docs</a>. 

</footer>

        </div>
      </div>

    </section>

  </div>
  


  

    <script type="text/javascript">
        var DOCUMENTATION_OPTIONS = {
            URL_ROOT:'../',
            VERSION:'3.0.0',
            LANGUAGE:'None',
            COLLAPSE_INDEX:false,
            FILE_SUFFIX:'.html',
            HAS_SOURCE:  true,
            SOURCELINK_SUFFIX: '.txt'
        };
    </script>
      <script type="text/javascript" src="../_static/jquery.js"></script>
      <script type="text/javascript" src="../_static/underscore.js"></script>
      <script type="text/javascript" src="../_static/doctools.js"></script>

  

  <script type="text/javascript" src="../_static/js/theme.js"></script>

  <script type="text/javascript">
      jQuery(function () {
          SphinxRtdTheme.Navigation.enable(true);
      });
  </script> 

</body>
</html>